-
Notifications
You must be signed in to change notification settings - Fork 3
csv_reader declaration to include columns with spaces #6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
not yet, this is a limitation with the custom parser But it should! Like you mentioned there are workaround if you use |
Alex, To the best of my understanding, columns with (unfortunate) spaces can be handled in SQLite in 3 ways: When creating a VT like: CREATE VIRTUAL TABLE IF NOT EXISTS temp.xsv_5m_Sales_Records_Test USING csv(filename='D:\Temp\5m-Sales-Records.csv.zst', header=false,Region TEXT, Country TEXT, [Item Type] TEXT it fails. When replacing "[]" with single quotes, the creation goes through but the results are:
Using alternate delimiters like '`' (backtick) allows the table creation to complete but the column will assume the generic name of 'Column1'
|
Alex, if I understand correctly the code despite my turkey induced drowsiness, the column space issue should be resolved in the sqlite-loadable-rs (vtab_argparse.rs). This is an attempt. Let me know your thoughts. pub fn arg_is_column_declaration(arg: &str) -> Result<Option<ColumnDeclaration>, String> {
if arg.trim().is_empty() {
return Ok(None);
}
// TODO this is a bit of a hack, but it works for now
// NOTE Test if 'arg' contains characters that imply the column name has a space.
let regex_check = Regex::new(r#"[\['"]"#).unwrap();
// NOTE Assign an appropriate regex string based on the above check.
let regex_string = if regex_check.is_match(arg) {
r#"^[\['"](?<name>.*?)[\]'"]\s+(?<declared_type>.*?)\s+(?<constraints>.*)$"#
} else {
r#"^(?<name>.*?)\s+(?<declared_type>.*?)\s+(?<constraints>.*)$"#
};
// NOTE Compile the regex string into a regex object.
let regex_result = Regex::new(regex_string);
// NOTE Check if the regex compiled successfully and assign the results to variables.
match regex_result {
Ok(re) => match re.captures(arg) {
Some(caps) => {
/*
// TEST
println!(
"Found match: Name: {0} - Declared Type: {1} - Constraints: {2}",
&caps["name"].trim(),
&caps["declared_type"].trim(),
&caps["constraints"].trim()
);
*/
// TODO Verify scope.
name = &caps["name"].trim();
declared_type = &caps["declared_type"].trim();
constraints = &caps["constraints"].trim();
}
None => {
println!("No match found!");
}
},
Err(err) => {
println!("Could not compile regex: {}", err);
}
}
Ok(Some(ColumnDeclaration::new(
name,
declared_type,
constraints,
)))
} Happy TG! |
While in agreement that spaces should never be used in a column name, at times, when dealing with externally sourced csvs, spaces are indeed present.
When attempting to create a virtual table that includes spaces, a 'vtable constructor failed: iavm_reader' error is returned.
e.g.
CREATE VIRTUAL TABLE temp.iavm_reader using csv_reader(plugin, [plugin name])
Is there a way to work around this?
PS
When creating a virtual table directly there are no issues with the header.
e.g.
CREATE VIRTUAL TABLE temp.iavm_csv using csv(filename=iavm.csv)
The text was updated successfully, but these errors were encountered: